
    Vote-boosting ensembles

    Vote-boosting is a sequential ensemble learning method in which the individual classifiers are built on differently weighted versions of the training data. To build a new classifier, the weight of each training instance is determined by the degree of disagreement among the current ensemble predictions for that instance. For low class-label noise levels, especially when simple base learners are used, emphasis should be placed on instances for which the disagreement rate is high. When more flexible classifiers are used and as the noise level increases, the emphasis on these uncertain instances should be reduced. In fact, at sufficiently high levels of class-label noise, the focus should be on instances on which the ensemble classifiers agree. The optimal type of emphasis can be determined automatically using cross-validation. An extensive empirical analysis using the beta distribution as the emphasis function illustrates that vote-boosting is an effective method for generating ensembles that are both accurate and robust.
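    As a rough illustration of this weighting scheme, the sketch below runs a vote-boosting loop for binary classification with decision stumps, using a beta density evaluated at each instance's current disagreement rate as the emphasis function. The number of rounds and the beta shape parameters are illustrative choices, not settings taken from the paper.

```python
# Rough sketch of vote-boosting for binary (0/1) labels with decision stumps.
# The beta shape parameters a, b and the number of rounds are illustrative,
# not values from the paper.
import numpy as np
from scipy.stats import beta
from sklearn.tree import DecisionTreeClassifier

def vote_boosting(X, y, n_rounds=51, a=2.0, b=2.0):
    n = len(y)
    ensemble = []
    votes = np.zeros((n, 2))                  # per-instance vote counts per class
    weights = np.full(n, 1.0 / n)             # uniform initial instance weights
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=weights)
        ensemble.append(stump)
        pred = stump.predict(X)
        votes[np.arange(n), pred] += 1
        # Disagreement rate: fraction of ensemble members in the minority vote.
        disagreement = votes.min(axis=1) / votes.sum(axis=1)
        # Beta-density emphasis: a = b > 1 favours uncertain instances (high
        # disagreement); a = b < 1 would favour instances the ensemble agrees on.
        weights = beta.pdf(np.clip(disagreement, 1e-3, 0.5), a, b)
        weights /= weights.sum()
    return ensemble

def predict_majority(ensemble, X):
    votes = np.mean([clf.predict(X) for clf in ensemble], axis=0)
    return (votes >= 0.5).astype(int)
```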

    Using boosting to prune bagging ensembles

    This is the author's version of a work that was accepted for publication in Pattern Recognition Letters. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition Letters 28.1 (2007): 156-165, DOI: 10.1016/j.patrec.2006.06.018.
    Boosting is used to determine the order in which classifiers are aggregated in a bagging ensemble. Early stopping in the aggregation of the classifiers in the ordered bagging ensemble allows the identification of subensembles that require less memory for storage, classify faster and can improve the generalization accuracy of the original bagging ensemble. In all the classification problems investigated, pruned ensembles with 20% of the original classifiers show statistically significant improvements over bagging. In problems where boosting is superior to bagging, these improvements are not sufficient to reach the accuracy of the corresponding boosting ensembles. However, ensemble pruning preserves the performance of bagging in noisy classification tasks, where boosting often has larger generalization errors. Therefore, pruned bagging should generally be preferred to complete bagging and, if no information about the level of noise is available, it is a robust alternative to AdaBoost.
    The authors acknowledge financial support from the Spanish Dirección General de Investigación, project TIN2004-07676-C02-02.
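    A loose sketch of the general idea (not the procedure published in the paper) is given below: a pool of bagged trees is ordered by repeatedly selecting the tree with the smallest boosting-style weighted training error and updating the instance weights as in AdaBoost, after which only an initial fraction of the ordered sequence is kept. Function and parameter names are illustrative.

```python
# Loose sketch: order a pool of bagged trees with AdaBoost-style weighted
# errors, then keep only an initial fraction of the ordered sequence.
# This illustrates the general idea, not the algorithm published in the paper.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

def boosting_ordered_pruning(X, y, n_estimators=100, keep_fraction=0.2):
    bag = BaggingClassifier(DecisionTreeClassifier(),
                            n_estimators=n_estimators).fit(X, y)
    pool = list(bag.estimators_)
    miss = {id(c): c.predict(X) != y for c in pool}   # misclassification masks
    weights = np.full(len(y), 1.0 / len(y))
    ordered = []
    while pool:
        # Pick the classifier with the smallest weighted training error.
        werrs = [np.dot(weights, miss[id(c)]) for c in pool]
        best = int(np.argmin(werrs))
        clf = pool.pop(best)
        ordered.append(clf)
        err = max(werrs[best], 1e-12)
        if err < 0.5:
            # AdaBoost-style reweighting: up-weight the instances it gets wrong.
            alpha = 0.5 * np.log((1.0 - err) / err)
            weights *= np.exp(np.where(miss[id(clf)], alpha, -alpha))
            weights /= weights.sum()
    keep = max(1, int(keep_fraction * n_estimators))
    return ordered[:keep]
```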

    Using a SPOC to flip the classroom

    Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. G. Martínez-Muñoz and E. Pulido, "Using a SPOC to flip the classroom," 2015 IEEE Global Engineering Education Conference (EDUCON), Tallinn, 2015, pp. 431-436, doi: 10.1109/EDUCON.2015.7096007.
    The benefits of integrating SPOC platforms into face-to-face education are yet to be completely analysed. In this work we propose to use SPOCs and video contents to flip the classroom. The objective is to improve the involvement and satisfaction of students with the course, to reduce drop-out rates and to improve the face-to-face course success rate. We apply these ideas to an undergraduate first-year course on Data Structures and Algorithms. The study is validated by collecting data from two consecutive editions of the course, one in which the flipped-classroom model and videos were used and another in which they were not. The gathered data include online data about the students' interaction with the SPOC materials and offline data collected during lectures and exams. In the edition where the SPOC materials were available, we observed a correlation between the students' final marks and the percentage of their accesses that were to videos rather than documents, which indicates better academic performance for students who prefer videos over documents.
    The authors acknowledge financial support from the Spanish Dirección General de Investigación project TIN2013-42351-P and from Comunidad de Madrid project CASI-CAM S2013/ICE-2845.

    Pruning in ordered bagging ensembles

    This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ICML '06: Proceedings of the 23rd International Conference on Machine Learning, http://dx.doi.org/10.1145/1143844.1143921.
    We present a novel ensemble pruning method based on reordering the classifiers obtained from bagging and then selecting a subset for aggregation. Ordering the classifiers generated by bagging makes it possible to build subensembles of increasing size by including first those classifiers that are expected to perform best when aggregated. Ensemble pruning is achieved by halting the aggregation process before all the generated classifiers are included in the ensemble. Pruned subensembles containing between 15% and 30% of the initial pool of classifiers, besides being smaller, improve the generalization performance of the full bagging ensemble in the classification problems investigated.
    The authors acknowledge financial support from the Spanish Dirección General de Investigación, project TIN2004-07676-C02-02.
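    The sketch below illustrates one simple ordering heuristic of this kind: trees are greedily reordered by how much each one improves majority-vote accuracy on a held-out set, and only an initial fraction of the ordered sequence is retained. The ordering criterion actually proposed in the paper differs, so this is purely an illustration; integer class labels 0..K-1 and a user-supplied validation split are assumed.

```python
# Illustrative reordering and pruning of a bagging ensemble via greedy
# reduce-error selection on a held-out set, followed by early stopping.
# Integer class labels 0..K-1 are assumed; not the paper's own criterion.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

def ordered_bagging_prune(X_tr, y_tr, X_val, y_val,
                          n_estimators=200, keep_fraction=0.25):
    bag = BaggingClassifier(DecisionTreeClassifier(),
                            n_estimators=n_estimators).fit(X_tr, y_tr)
    preds = np.array([est.predict(X_val) for est in bag.estimators_])
    n_classes = int(preds.max()) + 1
    votes = np.zeros((len(y_val), n_classes))
    remaining, order = list(range(n_estimators)), []
    while remaining:
        scores = []
        for i in remaining:
            # Accuracy of the majority vote if tree i were added next.
            v = votes.copy()
            v[np.arange(len(y_val)), preds[i]] += 1
            scores.append(np.mean(v.argmax(axis=1) == y_val))
        best = remaining.pop(int(np.argmax(scores)))
        votes[np.arange(len(y_val)), preds[best]] += 1
        order.append(best)
    keep = max(1, int(keep_fraction * n_estimators))   # early stopping point
    return [bag.estimators_[i] for i in order[:keep]]
```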

    Using all data to generate decision tree ensembles

    Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. G. Martinez-Munoz, A. Suarez, "Using all data to generate decision tree ensembles", IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 34, 4 (2004), pp. 393-397.
    This paper develops a new method to generate ensembles of classifiers that uses all available data to construct every individual classifier. The base algorithm builds a decision tree in an iterative manner: the training data are divided into two subsets. In each iteration, one subset is used to grow the decision tree, starting from the decision tree produced by the previous iteration. This fully grown tree is then pruned using the other subset. The roles of the data subsets are interchanged in every iteration. This process converges to a final tree that is stable with respect to the combined growing and pruning steps. To generate a variety of classifiers for the ensemble, we randomly create the subsets needed by the iterative tree construction algorithm. The method exhibits good performance on several standard datasets at low computational cost.
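    Standard decision-tree libraries cannot grow a tree incrementally from an existing one, so the rough sketch below only approximates the spirit of the procedure: each round fits a tree on one random half of the data, selects the cost-complexity pruning level that performs best on the other half, and then swaps the roles of the two halves. Names and the number of rounds are illustrative, not part of the published algorithm.

```python
# Rough approximation of the grow-on-one-subset / prune-on-the-other idea.
# scikit-learn trees cannot be extended from a previous tree, so each round
# refits from scratch and picks the pruning level on the complementary subset.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def grow_prune_tree(X, y, rounds=6, rng=None):
    rng = np.random.default_rng(rng)
    mask = rng.random(len(y)) < 0.5          # random split into two subsets
    grow, prune = mask, ~mask
    tree = None
    for _ in range(rounds):
        # Candidate pruning levels from the cost-complexity pruning path.
        alphas = DecisionTreeClassifier().cost_complexity_pruning_path(
            X[grow], y[grow]).ccp_alphas
        candidates = [DecisionTreeClassifier(ccp_alpha=a).fit(X[grow], y[grow])
                      for a in alphas]
        scores = [c.score(X[prune], y[prune]) for c in candidates]
        tree = candidates[int(np.argmax(scores))]
        grow, prune = prune, grow            # interchange the subset roles
    return tree

# An ensemble would then be obtained by repeating grow_prune_tree with
# different random subset splits and aggregating the trees by majority vote.
```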

    A Gradient Boosting Approach for Training Convolutional and Deep Neural Networks

    Deep learning has revolutionized the computer vision and image classification domains. In this context, Convolutional Neural Network (CNN) based architectures are the most widely applied models. In this article, we introduce two procedures for training CNNs and Deep Neural Networks (DNNs) based on Gradient Boosting (GB), namely GB-CNN and GB-DNN. These models are trained to fit the gradient of the loss function, i.e. the pseudo-residuals, of the previous models. At each iteration, the proposed method adds one dense layer to an exact copy of the previous deep NN model. The weights of the dense layers trained in previous iterations are frozen to prevent over-fitting, permitting the model to fit the new dense layer, and to fine-tune the convolutional layers (for GB-CNN), while still exploiting the information already learned. Through extensive experimentation on different 2D-image classification and tabular datasets, the presented models show superior classification accuracy with respect to standard CNNs and deep NNs with the same architectures.
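    As a loose illustration of this freeze-and-extend idea, ignoring the convolutional backbone and its fine-tuning, the sketch below grows a small fully connected regressor one dense block at a time: each new block and its output head are trained on the squared-loss pseudo-residuals of the additive model built so far, while earlier blocks stay frozen. The class name and all hyperparameters are made up; this is a sketch of the general mechanism, not the authors' GB-CNN/GB-DNN implementation.

```python
# Loose PyTorch sketch of freeze-and-extend boosting of dense blocks.
# Each stage fits a new block + head to the squared-loss pseudo-residuals
# while the blocks from earlier stages remain frozen. Illustrative only.
import torch
import torch.nn as nn

class BoostedMLP:
    def __init__(self, in_dim, hidden=64, stages=3, epochs=300, lr=1e-2):
        self.in_dim, self.hidden = in_dim, hidden
        self.stages, self.epochs, self.lr = stages, epochs, lr
        self.blocks, self.heads = [], []

    def _features(self, x):
        # Representation produced by all previously trained (frozen) blocks.
        for block in self.blocks:
            x = torch.relu(block(x))
        return x

    def fit(self, X, y):
        self.bias = float(y.mean())                 # constant initial model
        pred = torch.full_like(y, self.bias)
        in_dim = self.in_dim
        for _ in range(self.stages):
            residual = (y - pred).detach()          # pseudo-residuals, squared loss
            block = nn.Linear(in_dim, self.hidden)
            head = nn.Linear(self.hidden, 1)
            opt = torch.optim.Adam(list(block.parameters()) +
                                   list(head.parameters()), lr=self.lr)
            feats_in = self._features(X).detach()   # frozen blocks: no gradients
            for _ in range(self.epochs):
                opt.zero_grad()
                out = head(torch.relu(block(feats_in))).squeeze(-1)
                loss = ((out - residual) ** 2).mean()
                loss.backward()
                opt.step()
            for p in block.parameters():            # freeze before the next stage
                p.requires_grad_(False)
            self.blocks.append(block)
            self.heads.append(head)
            in_dim = self.hidden
            with torch.no_grad():
                pred = self.predict(X)
        return self

    def predict(self, X):
        out = torch.full((X.shape[0],), self.bias)
        h = X
        for block, head in zip(self.blocks, self.heads):
            h = torch.relu(block(h))
            out = out + head(h).squeeze(-1)         # additive boosting of heads
        return out
```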

    Sequential Training of Neural Networks with Gradient Boosting

    This paper presents a novel technique based on gradient boosting to train a shallow neural network (NN). Gradient boosting is an additive expansion algorithm in which a series of models are trained sequentially to approximate a given function. A neural network can also be seen as an additive model, where the scalar product of the responses of the last hidden layer and its weights provides the final output of the network. Instead of training the network as a whole, the proposed algorithm trains the network sequentially in T steps. First, the bias term of the network is initialized with a constant approximation that minimizes the average loss of the data. Then, at each step, a portion of the network, composed of J neurons, is trained to approximate the pseudo-residuals on the training data computed from the previous iterations. Finally, the T partial models and the bias are integrated as a single NN with T × J neurons in the hidden layer. Extensive experiments on classification and regression tasks show competitive generalization performance with respect to neural networks trained with standard solvers such as Adam, L-BFGS and SGD. Furthermore, we show that the design of the proposed method permits switching off a number of hidden units at test time (the units that were trained last) without a significant reduction in generalization ability. This permits adapting the model to different classification speed requirements on the fly.
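    A minimal sketch of this additive scheme for regression with squared loss is shown below, fitting a small one-hidden-layer network to the pseudo-residuals at each of T stages. The class name, shrinkage rate and solver settings are illustrative; the final merging of the partial models into one network with T × J hidden units is only indicated in a comment.

```python
# Minimal sketch: gradient boosting with small one-hidden-layer networks as
# base learners (squared loss assumed). Illustrative, not the paper's code.
import numpy as np
from sklearn.neural_network import MLPRegressor

class GBShallowNN:
    def __init__(self, T=10, J=5, lr=0.5):
        self.T, self.J, self.lr = T, J, lr
        self.stages = []

    def fit(self, X, y):
        self.bias = float(np.mean(y))          # constant initial approximation
        pred = np.full(len(y), self.bias)
        for _ in range(self.T):
            residual = y - pred                # negative gradient of squared loss
            net = MLPRegressor(hidden_layer_sizes=(self.J,), max_iter=2000)
            net.fit(X, residual)
            self.stages.append(net)
            pred = pred + self.lr * net.predict(X)
        # The T fitted nets all use one ReLU hidden layer, so they could be
        # merged into a single network with T * J hidden units by concatenating
        # their hidden layers and summing the (lr-scaled) output weights/biases.
        return self

    def predict(self, X):
        pred = np.full(X.shape[0], self.bias)
        for net in self.stages:
            pred = pred + self.lr * net.predict(X)
        return pred
```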

    Improving the robustness of bagging with reduced sampling size

    This is an electronic version of the paper presented at the 22nd European Symposium on Artificial Neural Networks, held in Bruges in 2014.
    Bagging is a simple classification algorithm that is robust in the presence of class-label noise. This algorithm builds an ensemble of classifiers from bootstrap samples drawn with replacement, of size equal to the original training set. However, several studies have shown that this choice of sampling size is arbitrary in terms of the generalization performance of the ensemble. In this study we discuss how small sampling ratios can contribute to the robustness of bagging in the presence of class-label noise. An empirical analysis on two datasets is carried out using different noise rates and bootstrap sampling sizes. The results show that, for the studied datasets, sampling rates of 20% clearly improve the performance of the bagging ensembles in the presence of class-label noise.
    The authors acknowledge financial support from the Spanish Dirección General de Investigación, project TIN2010-21575-C02-0.
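    For reference, the short sketch below compares standard bagging with bagging built on 20% bootstrap samples (scikit-learn's max_samples parameter) on a synthetic dataset with injected class-label noise. The dataset, noise rate and ensemble size are illustrative, not those of the study.

```python
# Sketch: bagging with a reduced bootstrap sampling ratio under label noise.
# Dataset, noise level and ensemble size are illustrative choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Inject 20% class-label noise into the training labels (binary 0/1).
rng = np.random.default_rng(0)
flip = rng.random(len(y_tr)) < 0.2
y_noisy = np.where(flip, 1 - y_tr, y_tr)

for ratio in (1.0, 0.2):                     # standard bagging vs. 20% subsampling
    bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=200,
                            max_samples=ratio, bootstrap=True, random_state=0)
    bag.fit(X_tr, y_noisy)
    print(f"sampling ratio {ratio:.0%}: test accuracy {bag.score(X_te, y_te):.3f}")
```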

    Small margin ensembles can be robust to class-label noise

    This is the author's version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Neurocomputing, Vol. 160 (2015), DOI: 10.1016/j.neucom.2014.12.086.
    Subsampling is used to generate bagging ensembles that are accurate and robust to class-label noise. The effect of using smaller bootstrap samples to train the base learners is to make the ensemble more diverse. As a result, the classification margins tend to decrease. In spite of having small margins, these ensembles can be robust to class-label noise. The validity of these observations is illustrated in a wide range of synthetic and real-world classification tasks. In the problems investigated, subsampling significantly outperforms standard bagging for different amounts of class-label noise. By contrast, the effectiveness of subsampling in random forest is problem dependent. In these types of ensembles, the best overall accuracy is obtained when the random trees are built on bootstrap samples of the same size as the original training data. Nevertheless, subsampling becomes more effective as the amount of class-label noise increases.
    The authors acknowledge financial support from Spanish Plan Nacional I+D+i Grant TIN2013-42351-P and from Comunidad de Madrid Grant S2013/ICE-2845 CASI-CAM-CM.
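    A short sketch of how the voting margin discussed above could be measured for a bagging ensemble is given below (margin = vote share of the true class minus the largest vote share among the other classes). Integer class labels and user-supplied train/test data are assumed; the function name is illustrative.

```python
# Sketch: per-instance voting margins of a bagging ensemble, defined as the
# vote share of the true class minus the largest vote share of any other
# class. Integer class labels 0..K-1 are assumed.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

def voting_margins(bag, X, y):
    preds = np.array([est.predict(X) for est in bag.estimators_])   # (T, n)
    n_classes = int(max(preds.max(), y.max())) + 1
    shares = np.stack([(preds == c).mean(axis=0) for c in range(n_classes)], axis=1)
    true_share = shares[np.arange(len(y)), y]
    shares[np.arange(len(y)), y] = -1.0          # exclude the true class
    return true_share - shares.max(axis=1)

# Example (assumes X_train, y_train, X_test, y_test already exist): comparing
# mean margins for full bootstrap samples vs. 20% subsampling.
# bag_full = BaggingClassifier(DecisionTreeClassifier(), n_estimators=200,
#                              max_samples=1.0).fit(X_train, y_train)
# bag_sub = BaggingClassifier(DecisionTreeClassifier(), n_estimators=200,
#                             max_samples=0.2).fit(X_train, y_train)
# print(voting_margins(bag_full, X_test, y_test).mean(),
#       voting_margins(bag_sub, X_test, y_test).mean())
```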

    An urn model for majority voting in classification ensembles

    In this work we analyze the class prediction of parallel randomized ensembles by majority voting as an urn model. For a given test instance, the ensemble can be viewed as an urn of marbles of different colors. A marble represents an individual classifier. Its color represents the class-label prediction of the corresponding classifier. The sequential querying of classifiers in the ensemble can be seen as draws without replacement from the urn. An analysis of this classical urn model based on the hypergeometric distribution makes it possible to estimate the confidence in the outcome of majority voting when only a fraction of the individual predictions is known. These estimates can be used to speed up prediction by the ensemble. Specifically, the aggregation of votes can be halted when the confidence in the final prediction is sufficiently high. If one assumes a uniform prior for the distribution of possible votes, the analysis is shown to be equivalent to a previous one based on Dirichlet distributions. The advantage of the current approach is that prior knowledge on the possible vote outcomes can be readily incorporated in a Bayesian framework. We show how incorporating this type of problem-specific knowledge into the statistical analysis of majority voting leads to faster classification by the ensemble and allows us to estimate the expected average speed-up beforehand.
    The authors acknowledge financial support from the Comunidad de Madrid (project CASI-CAM-CM S2013/ICE-2845) and from the Spanish Ministerio de Economía y Competitividad (projects TIN2013-42351-P and TIN2015-70308-REDT).
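    As a small illustration, the sketch below computes this confidence for a two-class ensemble under a uniform prior over the urn composition (the case noted above to be equivalent to the Dirichlet analysis): given T ensemble members, t queried votes and v of them for the currently leading class, it returns the posterior probability that this class also wins the full majority vote, which could then be compared against a halting threshold. The function name and example numbers are illustrative.

```python
# Sketch: urn-model (hypergeometric) confidence that the current leading class
# wins the full majority vote, assuming a uniform prior over urn compositions.
import numpy as np
from scipy.stats import hypergeom

def majority_confidence(T, t, v):
    # Feasible counts of marbles of the leading class in the whole urn.
    ks = np.arange(v, T - (t - v) + 1)
    # Uniform prior over compositions => posterior proportional to the
    # hypergeometric likelihood of observing v in t draws without replacement.
    posterior = hypergeom.pmf(v, T, ks, t)
    posterior /= posterior.sum()
    # Probability that the leading class holds a strict majority of all T votes.
    return posterior[ks > T / 2].sum()

# Example: 101 trees, 25 queried so far, 22 of them for the leading class.
print(majority_confidence(101, 25, 22))
# Voting could be halted once this confidence exceeds a threshold, e.g. 0.99.
```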